Architecture Comparison: Atom SaaS vs Open Source (atom-upstream)
**Date:** March 31, 2026
**Purpose:** Deep-dive architectural comparison between the main Atom SaaS repository and the atom-upstream open-source submodule
---
Executive Summary
This document provides a comprehensive architectural analysis comparing the **Atom SaaS** (main repository) and **atom-upstream** (open-source submodule) codebases. While both share common DNA, they have diverged to serve different deployment models:
- **atom-upstream**: Single-tenant, self-hosted platform optimized for personal and enterprise self-deployment
- **Main Repo (SaaS)**: Multi-tenant SaaS platform with tenant isolation, billing, and enterprise governance
---
1. High-Level Architecture Comparison
1.1 Architecture Philosophy
atom-upstream (Open Source)
┌─────────────────────────────────────────────────────────────┐
│ Single Tenant Instance │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Frontend Layer │ │
│ │ Next.js 14 + React 18 + TypeScript │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ API & Orchestration Layer │ │
│ │ FastAPI + WebSocket Server │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ • Enhanced AI Workflow Router │ │ │
│ │ │ • Advanced Workflow Orchestrator │ │ │
│ │ │ • Background Agent Runner │ │ │
│ │ │ • Batch Sync Workers │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ LLM & Intelligence Tier │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ • LLMService (Cognitive Entry Point) │ │ │
│ │ │ • BYOKHandler (Tenant Isolation - disabled) │ │ │
│ │ │ • Cognitive Tiering (5-Tier Logic) │ │ │
│ │ │ • ReflectionPool (Mistake Storage) │ │ │
│ │ │ • GraduationService (Performance Review) │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Tool Protocol (MCP) │ │
│ │ MCP Service + Tool Search │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Core Integration Services │ │
│ │ CRM | Finance | Comm | Storage | Smart Home | Media │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Persistence Layer │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │PostgreSQL│ │LanceDB │ │Valkey │ │ │
│ │ │or SQLite │ │(Vector) │ │(Redis) │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

**Key Characteristics:**
- **Single execution path**: Direct from UI → API → LLM → Integrations
- **No tenant isolation**: All data belongs to single tenant
- **Flexible database**: SQLite (Personal) or PostgreSQL (Enterprise)
- **Local-first**: Optimized for local execution with optional cloud integrations
- **BYOK model**: Bring Your Own Key for all AI providers
Main Repo (SaaS)
┌─────────────────────────────────────────────────────────────┐
│ Multi-Tenant Platform │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Presentation Layer │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Next.js Web │ │ Tauri Desktop│ │ │
│ │ │ :3000 │ │ Local │ │ │
│ │ └──────┬───────┘ └──────┬───────┘ │ │
│ └─────────┼────────────────────────┼─────────────────────┘ │
│ │ │ │
│ ┌─────────▼────────────────────────▼─────────────────────┐ │
│ │ API Gateway Layer │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Next.js API Routes + FastAPI Backend (:8000) │ │ │
│ │ │ • Subdomain-based tenant routing │ │ │
│ │ │ • Custom domain support │ │ │
│ │ │ • Tenant context extraction │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Brain Systems Layer (6 Core Systems) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ 1. Cognitive Architecture (Reasoning & Memory) │ │ │
│ │ │ 2. Learning Engine (Adaptation & RLHF) │ │ │
│ │ │ 3. World Model (Long-term Memory + pgvector) │ │ │
│ │ │ 4. Reasoning Engine (Proactive Intelligence) │ │ │
│ │ │ 5. Cross-System Reasoning (Multi-System Corr.) │ │ │
│ │ │ 6. Agent Governance (Permissions & Safety) │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Skill Execution Layer │ │
│ │ • Skill Registry (Dynamic Loading) │ │
│ │ • Skill Executor (Action Execution) │ │
│ │ • Agent Runner (Orchestration) │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Integration Layer │ │
│ │ • 39+ OAuth Integrations (CRM, Comm, Storage) │ │
│ │ • MCP Server (Interoperability) │ │
│ │ • Tenant-scoped credentials │ │
│ └────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌────────────────────▼───────────────────────────────────┐ │
│ │ Data Layer (Multi-Tenant Isolation) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ PostgreSQL + RLS (Row-Level Security) │ │ │
│ │ │ • Tenant isolation via current_setting() │ │ │
│ │ │ • All tables include tenant_id column │ │ │
│ │ │ • Automatic policy creation │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │LanceDB │ │Redis │ │AWS S3 │ │ │
│ │ │(Tenant) │ │(Namespace│ │(Prefix │ │ │
│ │ │ │ │Scoped) │ │Isolation)│ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

**Key Characteristics:**
- **Multi-tenant isolation**: Database, storage, and cache isolation
- **Subdomain routing**: Automatic tenant context extraction
- **Row-Level Security**: PostgreSQL RLS policies on all tables
- **SaaS billing**: Stripe integration with usage metering
- **Enterprise governance**: Maturity-based permissions, audit logging
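The subdomain routing described above can be sketched as a small pure function. This is a hypothetical illustration, not code from either repository; the function name and base domain are assumptions.

```python
# Hypothetical sketch of the subdomain-routing step: map a Host header
# to a tenant slug. Base domain and function name are illustrative.
def extract_tenant_slug(host: str, base_domain: str = "atom.example.com"):
    """Return the tenant subdomain from a Host header, or None."""
    hostname = host.split(":")[0].lower()  # strip any port
    suffix = "." + base_domain
    if hostname.endswith(suffix):
        slug = hostname[: -len(suffix)]
        if slug and "." not in slug:  # reject empty or nested subdomains
            return slug
    return None  # apex or custom domain: resolved by a lookup table instead
```

A custom domain (also supported, per the diagram) would not match the suffix and falls through to `None`, where a database lookup would take over.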
---
1.2 Execution Paths
atom-upstream: Three Independent Paths
**Paths:**
- **Direct Real-Time**: Chat UI → Router → LLMService → Integrations (low latency)
- **Autonomous (Scheduled)**: BackgroundRunner → LLMService → Integrations (no human intervention)
- **Coordination (Workflows)**: Orchestrator → Multi-step workflow → LLMService → Integrations
Main Repo (SaaS): Single Path with Governance
**Path:**
- **Governed Execution**: User Input → Tenant Context → Governance → Brain Systems → Skill Execution → Integrations → Learning
**Key Difference:** SaaS has mandatory governance checks before every action
---
2. Database Architecture
2.1 Schema Comparison
atom-upstream (Single Tenant)
**Database Options:**
- **SQLite**: Personal Edition (default, embedded)
- **PostgreSQL**: Enterprise Edition (production)
**Key Tables:** (8,524 lines of models)

```sql
-- No tenant_id columns (single tenant)
CREATE TABLE agents (
    id UUID PRIMARY KEY,
    name VARCHAR(255),
    maturity_level VARCHAR(50),
    cognitive_profile JSONB,
    created_at TIMESTAMP
);

CREATE TABLE agent_executions (
    id UUID PRIMARY KEY,
    agent_id UUID REFERENCES agents(id),
    task_description TEXT,
    outcome TEXT,
    outcome_score FLOAT,
    created_at TIMESTAMP
);

CREATE TABLE integrations (
    id UUID PRIMARY KEY,
    integration_type VARCHAR(100),
    credentials_encrypted TEXT,
    created_at TIMESTAMP
);

-- No RLS policies
-- No tenant isolation
```

**Characteristics:**
- ❌ No `tenant_id` columns
- ❌ No Row-Level Security
- ❌ No tenant isolation
- ✅ Simpler queries (no `WHERE tenant_id = ?`)
- ✅ Better performance (no RLS overhead)
Main Repo (Multi-Tenant)
**Database:** PostgreSQL 15+ with pgvector
**Key Tables:** (9,464 lines of models)
```sql
-- All tables include tenant_id
CREATE TABLE tenants (
    id UUID PRIMARY KEY,
    name VARCHAR(255),
    subdomain VARCHAR(100) UNIQUE,
    custom_domain VARCHAR(255) UNIQUE,
    edition VARCHAR(50),  -- 'personal' or 'enterprise'
    plan_type VARCHAR(50),
    stripe_customer_id VARCHAR(255),
    subscription_status VARCHAR(50),
    created_at TIMESTAMP
);

CREATE TABLE agents (
    id UUID PRIMARY KEY,
    tenant_id UUID REFERENCES tenants(id),  -- ← Multi-tenant
    name VARCHAR(255),
    maturity_level VARCHAR(50),
    cognitive_profile JSONB,
    created_at TIMESTAMP
);
CREATE INDEX idx_tenant_agents ON agents (tenant_id);

CREATE TABLE agent_executions (
    id UUID PRIMARY KEY,
    tenant_id UUID REFERENCES tenants(id),  -- ← Multi-tenant
    agent_id UUID REFERENCES agents(id),
    task_description TEXT,
    outcome TEXT,
    outcome_score FLOAT,
    created_at TIMESTAMP
);
CREATE INDEX idx_tenant_executions ON agent_executions (tenant_id);

-- Row-Level Security enabled
ALTER TABLE agents ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON agents
    FOR ALL
    USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- 9+ tables with RLS policies
```

**Additional SaaS Tables:**
- `tenants` - Tenant metadata and billing
- `tenant_settings` - Key-value settings per tenant
- `workspaces` - Workspace isolation within tenant
- `subscriptions` - Stripe subscription tracking
- `saas_tiers` - Pricing tier definitions
- `usage_events` - Usage metering for billing
- `audit_logs` - Multi-tenant audit trail

**Characteristics:**
- ✅ `tenant_id` on all tables
- ✅ Row-Level Security (RLS)
- ✅ Automatic tenant isolation
- ✅ S3 prefix isolation (`s3://bucket/{tenant_id}/`)
- ✅ Redis namespace separation (`tenant:{tenant_id}:key`)
- ❌ More complex queries
- ❌ RLS performance overhead (~5-10%)
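The Redis namespace and S3 prefix patterns quoted above reduce to two one-line key builders. The sketch below follows those documented formats exactly; the function names themselves are illustrative, not from the repo.

```python
# Minimal helpers following the documented isolation patterns:
# Redis keys under `tenant:{tenant_id}:` and S3 objects under a
# per-tenant prefix. Function names are illustrative.
def tenant_cache_key(tenant_id: str, key: str) -> str:
    """Namespace a cache key so tenants can never collide in Redis."""
    return f"tenant:{tenant_id}:{key}"

def tenant_s3_key(tenant_id: str, path: str) -> str:
    """Prefix an object key so IAM policies can scope access per tenant."""
    return f"{tenant_id}/{path.lstrip('/')}"
```

Keeping the tenant ID as the leading S3 path component is what allows bucket policies to grant a tenant's workers access to `{tenant_id}/*` only.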
---
2.2 Tenant Context Management
atom-upstream
```python
# tenant_context.py - Simplified for single tenant
# File NOT FOUND in upstream - tenant context not needed
# Single tenant = no context extraction required
# All data implicitly belongs to the single tenant
```

**Result:** No tenant context management needed
Main Repo (SaaS)
```python
# core/tenant_context.py (200+ lines)
import contextvars
from typing import Any, Dict

from fastapi import Depends, HTTPException, Request
from sqlalchemy.orm import Session

# Context variable for tenant ID
_current_tenant_id = contextvars.ContextVar("current_tenant_id", default=None)

class TenantContext:
    """Context manager for tenant isolation"""

    def __init__(self, tenant_id: str):
        self.tenant_id = tenant_id

    def __enter__(self):
        self.token = _current_tenant_id.set(self.tenant_id)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        _current_tenant_id.reset(self.token)

def get_safe_tenant_context(request: Request) -> Dict[str, Any]:
    """
    Safely extract tenant context from request with DB validation.
    Prevents tenant enumeration attacks.
    """
    # 1. Extract tenant_id from request state (set by middleware)
    tenant_id = getattr(request.state, "tenant_id", None)

    # 2. Get database session
    db = get_db_from_request(request)

    # 3. Validate tenant exists in database
    if validate_tenant_exists(db, tenant_id):
        return {"tenant_resolved": True, "tenant_id": tenant_id}
    else:
        # Log security event
        log_tenant_enumeration_attempt(request, tenant_id)
        return {"tenant_resolved": False}

# Usage in API routes
@app.get("/api/agents")
async def get_agents(request: Request, db: Session = Depends(get_db)):
    # Extract and validate tenant context
    tenant_context = get_safe_tenant_context(request)
    if not tenant_context["tenant_resolved"]:
        raise HTTPException(401, "Tenant context required")

    # All queries automatically scoped to tenant
    agents = db.query(Agent).filter(
        Agent.tenant_id == tenant_context["tenant_id"]
    ).all()
    return agents
```

**Tenant Resolution Priority:**
1. `X-Tenant-ID` header
2. JWT token `tenant_id` claim
3. Subdomain from Host header
4. User's `tenant_id` attribute
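The resolution priority above can be sketched as a first-match chain. This is an illustrative reconstruction, assuming the headers, JWT claims, and user record were already parsed upstream; the subdomain step is simplified (a real resolver must still map the slug to a tenant UUID).

```python
from typing import Optional

# Illustrative sketch of the documented four-step resolution order;
# the function name and simplified subdomain parsing are assumptions.
def resolve_tenant_id(headers: dict, jwt_claims: dict, user) -> Optional[str]:
    """Return the first tenant identifier found, in priority order."""
    if headers.get("X-Tenant-ID"):
        return headers["X-Tenant-ID"]        # 1. explicit header
    if jwt_claims.get("tenant_id"):
        return jwt_claims["tenant_id"]       # 2. JWT claim
    host = headers.get("Host", "")
    if "." in host:
        return host.split(":")[0].split(".")[0]  # 3. subdomain (slug)
    return getattr(user, "tenant_id", None)  # 4. user attribute
```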
---
2.3 Row-Level Security (RLS) Implementation
atom-upstream
```sql
-- NO RLS policies
-- Single tenant = no isolation needed

-- Example query (simple)
SELECT * FROM agents WHERE id = '...';
```

Main Repo (SaaS)

```sql
-- Enable RLS on all tenant-scoped tables
ALTER TABLE agents ENABLE ROW LEVEL SECURITY;
ALTER TABLE agent_executions ENABLE ROW LEVEL SECURITY;
ALTER TABLE integrations ENABLE ROW LEVEL SECURITY;
-- ... 9+ tables total

-- Create isolation policies
CREATE POLICY tenant_isolation ON agents
    FOR ALL
    USING (tenant_id = current_setting('app.current_tenant_id')::UUID);

-- Set tenant context at session start
SET app.current_tenant_id = '550e8400-e29b-41d4-a716-446655440000';

-- Example query (automatically filtered by RLS)
SELECT * FROM agents WHERE id = '...';
-- PostgreSQL automatically adds: AND tenant_id = current_setting('app.current_tenant_id')
```

**RLS Policy Coverage:**
- `agents` - Agent definitions
- `agent_executions` - Execution history
- `integrations` - OAuth credentials
- `tenant_settings` - Tenant configuration
- `workspaces` - Workspace isolation
- `users` - User accounts
- `audit_logs` - Audit trail
- `subscriptions` - Billing data
- `usage_events` - Usage metering
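Because the `SET` statement above cannot take a bound parameter, application code has to interpolate the tenant ID into SQL. A hedged sketch of one safe way to do that, assuming tenant IDs are UUIDs as in the schema; PostgreSQL's `set_config()` with a bound parameter is an equally valid alternative.

```python
import uuid

# Validate the tenant id as a UUID before interpolating it into the
# session SET statement, so the value cannot smuggle in SQL.
def set_tenant_sql(tenant_id: str) -> str:
    tid = uuid.UUID(tenant_id)  # raises ValueError for malformed input
    return f"SET app.current_tenant_id = '{tid}'"
```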
---
3. Brain Systems Architecture
3.1 Brain Systems Overview
Both repositories share the same **6 core brain systems**, but with different implementations:
| System | atom-upstream | Main Repo (SaaS) |
|---|---|---|
| **Cognitive Architecture** | ✅ Full | ✅ Full + Tenant context |
| **Learning Engine** | ✅ Full | ✅ Full + Multi-tenant learning |
| **World Model** | ✅ Full | ✅ Full + Tenant isolation |
| **Reasoning Engine** | ✅ Full | ✅ Full + Governance checks |
| **Cross-System Reasoning** | ✅ Full | ✅ Full + Tenant-scoped integrations |
| **Agent Governance** | ✅ Basic | ✅ Enhanced + RBAC |
---
3.2 Brain System Differences
Cognitive Architecture
**atom-upstream:**
```typescript
// Direct reasoning without tenant context
const result = await cognitive.reason(agentId, task, {
  memories,
  constraints
});
```

**Main Repo (SaaS):**

```typescript
// Tenant-aware reasoning with governance
const result = await cognitive.reason(tenantId, agentId, task, {
  memories,
  constraints,
  governance: decision,       // ← Mandatory governance check
  tenant_settings: settings   // ← Tenant-specific config
});
```

---
World Model (Memory System)
**atom-upstream:**
```python
# Single-tenant memory storage
async def record_experience(experience: Experience):
    # Store in LanceDB without tenant_id
    await lancedb.add({
        "agent_id": experience.agent_id,
        "task_type": experience.task_type,
        "embedding": embedding,
        "metadata": experience.metadata
    })
```

**Main Repo (SaaS):**

```python
# Multi-tenant memory isolation
async def record_experience(tenant_id: str, experience: Experience):
    # Store with tenant isolation
    await lancedb.add({
        "tenant_id": tenant_id,  # ← Tenant isolation
        "agent_id": experience.agent_id,
        "task_type": experience.task_type,
        "embedding": embedding,
        "metadata": experience.metadata
    })

    # S3 path includes tenant_id
    s3_path = f"s3://atom-saas/{tenant_id}/lancedb/{experience.id}"
```

---
Learning Engine
**atom-upstream:**
```python
# Single-tenant learning
async def generate_adaptations(agent_id: str):
    # Analyze all experiences for this agent
    experiences = await get_agent_experiences(agent_id)
    patterns = await identify_patterns(experiences)
    return build_adaptations(patterns)  # hypothetical helper
```

**Main Repo (SaaS):**

```python
# Multi-tenant learning with isolation
async def generate_adaptations(tenant_id: str, agent_id: str):
    # Analyze only tenant-scoped experiences
    experiences = await get_agent_experiences(tenant_id, agent_id)
    patterns = await identify_patterns(experiences)

    # Cross-agent learning within tenant (optional)
    if tenant_settings.collaborative_memory:
        tenant_agents = await get_tenant_agents(tenant_id)
        shared_patterns = await identify_cross_agent_patterns(tenant_agents)
        patterns.extend(shared_patterns)

    return build_adaptations(patterns)  # hypothetical helper
```

---
Agent Governance
**atom-upstream:**
```python
# Basic governance (maturity levels only)
async def can_perform_action(agent_id: str, action_type: str):
    agent = await get_agent(agent_id)
    maturity_requirements = {
        "READ": "student",
        "CREATE": "supervised",
        "DELETE": "autonomous"
    }
    if agent.maturity_level < maturity_requirements[action_type]:
        return {"allowed": False, "reason": "Insufficient maturity"}
    return {"allowed": True}
```

**Main Repo (SaaS):**
```python
# Enhanced governance with RBAC and tenant isolation
async def can_perform_action(tenant_id: str, agent_id: str, action_type: str):
    # 1. Check tenant exists and is active
    tenant = await get_tenant(tenant_id)
    if not tenant or not tenant.is_active:
        return {"allowed": False, "reason": "Tenant not found or inactive"}

    # 2. Check tenant edition features
    if action_type == "BULK_OPERATIONS" and tenant.edition == "personal":
        return {"allowed": False, "reason": "Feature requires Enterprise edition"}

    # 3. Check agent maturity
    agent = await get_agent(tenant_id, agent_id)  # ← Tenant-scoped query
    if agent.maturity_level < MATURITY_REQUIREMENTS[action_type]:
        return {"allowed": False, "reason": "Insufficient maturity"}

    # 4. Check rate limits (per-tenant)
    rate_limit = await get_tenant_rate_limit(tenant_id)
    if await exceeds_rate_limit(tenant_id, rate_limit):
        return {"allowed": False, "reason": "Rate limit exceeded"}

    # 5. Check user permissions (RBAC)
    user = await get_current_user()
    if not await rbac.has_permission(user, action_type):
        return {"allowed": False, "reason": "Insufficient permissions"}

    # 6. Log for audit
    await log_audit_event({
        "tenant_id": tenant_id,
        "agent_id": agent_id,
        "action": action_type,
        "result": "allowed"
    })
    return {"allowed": True}
```

**Governance Features Comparison:**
| Feature | atom-upstream | Main Repo (SaaS) |
|---|---|---|
| Maturity Levels | ✅ 4 levels | ✅ 4 levels + RBAC |
| Permission Checks | ✅ Basic | ✅ Enhanced (40+ permissions) |
| Rate Limiting | ❌ None | ✅ Per-tenant quotas |
| Audit Logging | ✅ Basic | ✅ Multi-tenant audit trail |
| Tenant Validation | ❌ N/A | ✅ Mandatory |
| Edition Checks | ❌ N/A | ✅ Personal vs Enterprise |
| Budget Enforcement | ❌ N/A | ✅ 4-mode enforcement |
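One caveat about the snippets above: they compare maturity levels with `<` on raw strings, which orders them alphabetically rather than by trust. An explicit rank table makes the ordering deliberate. The level names come from the upstream snippet (the table mentions a fourth level not named in the code); the rank values and function name are assumptions.

```python
# Explicit trust ordering for the maturity levels named in the snippets;
# rank values are assumed for illustration.
MATURITY_RANK = {"student": 0, "supervised": 1, "autonomous": 2}

def meets_maturity(agent_level: str, required_level: str) -> bool:
    """True when the agent's maturity is at or above the requirement."""
    return MATURITY_RANK[agent_level] >= MATURITY_RANK[required_level]
```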
---
4. Integration Architecture
4.1 Integration Coverage
atom-upstream: 46+ Integrations
**Categories:**
- **AI Providers** (5): OpenAI, Anthropic, DeepSeek, Google, GLM
- **Communication** (6): Slack, Discord, WhatsApp, Telegram, Teams, Twilio
- **CRM** (3): Salesforce, HubSpot, Zoho
- **Email** (2): Gmail, SendGrid
- **Project Management** (6): GitHub, GitLab, Jira, Linear, Notion, Asana
- **Storage** (5): Google Drive, OneDrive, Dropbox, Box, S3
- **Smart Home** (4): Philips Hue, Home Assistant, Sonos, FFmpeg
- **Media** (3): Spotify, YouTube, FFmpeg
- **Finance** (4): Stripe, QuickBooks, Xero, Plaid
- **Customer Support** (3): Zendesk, Intercom, Help Scout
- **HR** (2): LinkedIn, Indeed
- **Marketing** (3): Mailchimp, SendGrid, HubSpot
- **Security** (2): 1Password, LastPass
- **Analytics** (2): Mixpanel, Amplitude
- **DevOps** (4): PagerDuty, Statuspage, Datadog, New Relic
**Implementation:**
```python
# OAuth flow with local storage
class IntegrationService:
    async def connect_integration(self, provider: str, credentials: dict):
        # Encrypt credentials
        encrypted = encrypt_credentials(credentials)

        # Store in integrations table (no tenant_id)
        await db.execute("""
            INSERT INTO integrations (integration_type, credentials_encrypted)
            VALUES (%s, %s)
        """, provider, encrypted)
```

Main Repo (SaaS): ~10 Integrations
**Categories:**
- **CRM** (3): Salesforce, HubSpot, Zoho ✅
- **Communication** (partial): Slack, Google, Microsoft ✅
- **Email** (1): SendGrid ✅
- **Finance** (1): Stripe ✅ (billing integration)
- **Storage** (1): AWS S3 ✅
**Missing:** 35+ integrations from upstream
**Implementation:**
```python
# Multi-tenant OAuth with tenant isolation
class IntegrationService:
    async def connect_integration(self, tenant_id: str, provider: str, credentials: dict):
        # Validate tenant exists
        tenant = await get_tenant(tenant_id)
        if not tenant:
            raise TenantNotFoundError()

        # Encrypt with tenant-specific key
        encrypted = encrypt_credentials(credentials, tenant_id)

        # Store with tenant isolation
        await db.execute("""
            INSERT INTO integrations (tenant_id, integration_type, credentials_encrypted)
            VALUES (%s, %s, %s)
        """, tenant_id, provider, encrypted)

        # Log audit event
        await log_audit_event({
            "tenant_id": tenant_id,
            "action": "integration_connected",
            "provider": provider
        })
```

---
4.2 Local-Only Mode
atom-upstream
**Feature:** Privacy-focused deployment that blocks all cloud services
```bash
# .env configuration
ATOM_LOCAL_ONLY=true  # Blocks Spotify, Notion, etc.
```

**Behavior:**
- ✅ Local services continue working: Hue, Home Assistant, FFmpeg, Sonos
- ❌ Cloud services blocked: Spotify, Notion, cloud-based OAuth
- ✅ All data stays local
- ✅ No outbound connections (except AI providers if configured)
**Implementation:**
```python
# Integration gateway with local-only check
class IntegrationGateway:
    async def call_integration(self, provider: str, action: str):
        # Check local-only mode
        if os.getenv("ATOM_LOCAL_ONLY") == "true":
            if provider in CLOUD_SERVICES:
                raise IntegrationBlockedError(
                    f"{provider} is blocked in local-only mode"
                )

        # Proceed with integration call
        return await execute_integration(provider, action)
```

Main Repo (SaaS)
**Status:** ❌ Not implemented
**Reason:** SaaS model requires cloud connectivity for multi-tenant features
---
5. Deployment Architecture
5.1 Deployment Options
atom-upstream
**5 Deployment Methods:**
- **Docker Compose (Recommended)**
**Services:**
- atom-backend (FastAPI + SQLite)
- valkey (Redis-compatible cache)
- atom-frontend (Next.js)
- browser-node (Browser automation)
- **pip install**
- **Native Installation**
- **DigitalOcean 1-Click**
- Pre-configured droplet
- Automated setup script
- **ATOM Cloud**
Main Repo (SaaS)
**1 Deployment Method:**
- **ATOM Cloud (SaaS-focused)**
```bash
# Backend API
atom-cli deploy --service api
```
**Services:**
- Next.js web app (Cloud managed)
- FastAPI backend (Managed nodes)
- PostgreSQL (Neon serverless - external)
- Redis (Upstash - external)
**Missing:**
- ❌ Docker Compose for SaaS (only basic backend-saas/docker-compose.prod.yml)
- ❌ pip install package
- ❌ Native installation scripts
- ❌ DigitalOcean/AWS deployment guides
- ❌ Self-hosting documentation
---
5.2 Docker Configuration Comparison
atom-upstream: Comprehensive Docker Compose
**docker-compose-personal.yml (250+ lines):**
```yaml
version: '3.8'

services:
  atom-backend:
    build:
      context: ./backend
      dockerfile: Dockerfile
    container_name: atom-personal-backend
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=sqlite:///./data/atom.db
      - SQLITE_PATH=./data/atom.db
      - ENVIRONMENT=development
      - LOG_LEVEL=INFO
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - BYOK_ENCRYPTION_KEY=${BYOK_ENCRYPTION_KEY:-}
      - JWT_SECRET_KEY=${JWT_SECRET_KEY:-}
      - LANCEDB_PATH=./data/lancedb
      - ENABLE_LANCEDB=true
      - EMBEDDING_PROVIDER=fastembed
      - REDIS_URL=redis://valkey:6379
      # Phase 66: Personal Edition Enhancements
      - ATOM_LOCAL_ONLY=${ATOM_LOCAL_ONLY:-false}
      - SPOTIFY_CLIENT_ID=${SPOTIFY_CLIENT_ID:-}
      - SPOTIFY_CLIENT_SECRET=${SPOTIFY_CLIENT_SECRET:-}
      - HUE_BRIDGE_IP=${HUE_BRIDGE_IP:-}
      - HOME_ASSISTANT_URL=${HOME_ASSISTANT_URL:-}
      - NOTION_CLIENT_ID=${NOTION_CLIENT_ID:-}
      - FFMPEG_ALLOWED_DIRS=/app/data/media,/app/data/exports
      # Security & Audit
      - AUDIT_LOG_PATH=logs/audit.log
      - AUDIT_LOG_RETENTION_DAYS=${AUDIT_LOG_RETENTION_DAYS:-90}
    volumes:
      - ./data:/app/data
      - ./backend:/app  # Hot reload
    working_dir: /app
    command: uvicorn main_api_app:app --host 0.0.0.0 --port 8000 --reload
    restart: unless-stopped
    depends_on:
      valkey:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health/live"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    networks:
      - atom-local
    extra_hosts:
      - "host.docker.internal:host-gateway"

  valkey:
    image: valkey/valkey:latest
    container_name: atom-personal-valkey
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    networks:
      - atom-local

  atom-frontend:
    build:
      context: ./frontend-nextjs
      dockerfile: Dockerfile
    container_name: atom-personal-frontend
    ports:
      - "3000:3000"
    environment:
      - NEXT_PUBLIC_API_URL=http://localhost:8000
      - NODE_ENV=development
    volumes:
      - ./frontend-nextjs:/app
      - /app/node_modules
      - /app/.next
    working_dir: /app
    command: npm run dev
    restart: unless-stopped
    depends_on:
      - atom-backend
      - valkey
    networks:
      - atom-local

  browser-node:
    image: browserless/chrome:latest
    container_name: atom-personal-browser
    ports:
      - "3001:3000"
    environment:
      - MAX_CONCURRENT_SESSIONS=3
      - MAX_QUEUE_LENGTH=5
      - PREBOOT_CHROME=true
    restart: unless-stopped
    networks:
      - atom-local

networks:
  atom-local:
    driver: bridge
    internal: false
    ipam:
      config:
        - subnet: 172.28.0.0/16

volumes:
  atom-personal-data:
    driver: local
```

**Features:**
- ✅ 4 services (backend, cache, frontend, browser)
- ✅ Hot reload for development
- ✅ Health checks on all services
- ✅ Volume persistence
- ✅ Network isolation
- ✅ Local-only mode support
- ✅ 46+ integration credentials
- ✅ Audit logging configuration
Main Repo (SaaS): Basic Docker Compose
**backend-saas/docker-compose.prod.yml (50 lines):**
```yaml
version: '3.8'

services:
  db:
    image: postgres:15-alpine
    restart: always
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-atom}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-atom_password}
      POSTGRES_DB: ${POSTGRES_DB:-atom_db}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U atom"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    restart: always
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  api:
    build: .
    restart: always
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db:5432/${POSTGRES_DB}
      REDIS_URL: redis://redis:6379/0
      SECRET_KEY: ${SECRET_KEY}
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      ENVIRONMENT: production
    ports:
      - "8000:8000"
    volumes:
      - ./brain_artifacts:/app/brain_artifacts

volumes:
  postgres_data:
  redis_data:
```

**Missing:**
- ❌ Frontend service
- ❌ Browser automation
- ❌ Development mode
- ❌ Integration credentials
- ❌ Monitoring stack
- ❌ Backup automation
- ❌ Health checks (minimal)
---
5.3 Environment Configuration
atom-upstream: Comprehensive .env.example (323 lines)
**19 Sections:**
- Core Configuration (NODE_ENV, LOG_LEVEL)
- Database & Storage (SQLite/PostgreSQL, LanceDB)
- Authentication & Security (REQUIRED keys)
- AI Service Credentials (5 providers)
- Communication Integrations (6 platforms)
- Google Services (OAuth)
- Project Management (6 tools)
- CRM Integrations (3 platforms)
- Storage & Cloud (5 providers)
- Smart Home (4 platforms)
- Media & Entertainment (3 platforms)
- Development & DevOps (4 tools)
- Finance & Accounting (4 platforms)
- Customer Support (3 platforms)
- HR & Recruitment (2 platforms)
- Marketing & SEO (3 platforms)
- Security & Compliance (2 tools)
- Analytics & BI (2 platforms)
- Other Integrations (5+ platforms)
**Total:** 46+ integration providers documented
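A deployment with this many optional settings benefits from a startup check for the few required ones. The sketch below is illustrative, not code from either repo; the variable names are taken from the compose file above (`JWT_SECRET_KEY`, `BYOK_ENCRYPTION_KEY`), and the function name is an assumption.

```python
import os

# Illustrative startup check: report required settings that are
# unset or empty so the server can fail fast with a clear message.
def missing_required_env(required=("JWT_SECRET_KEY", "BYOK_ENCRYPTION_KEY")):
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not os.getenv(name)]
```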
Main Repo (SaaS): Limited .env.example (237 lines)
**6 Sections:**
- Core Platform (basic URLs)
- Billing (Stripe keys)
- Email (SendGrid/SES)
- AWS Services (S3, credentials)
- CRM Integrations (3 platforms)
- Communication (partial)
**Missing:**
- ❌ AI provider credentials
- ❌ 35+ integration credentials
- ❌ Smart home integrations
- ❌ Media integrations
- ❌ Project management tools
- ❌ DevOps tools
- ❌ Analytics platforms
---
6. Security Architecture
6.1 Multi-Layer Security
atom-upstream
**Security Layers:**
┌─────────────────────────────────────┐
│ Network Security │
│ • TLS 1.3 │
│ • DDoS Protection (Global Edge) │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Authentication │
│ • JWT (24-hour expiry) │
│ • OAuth2 (Google, Okta, Auth0) │
│ • Mobile biometric support │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Data Encryption │
│ • Fernet symmetric encryption │
│ • BYOK (Bring Your Own Key) │
│ • Per-user credential encryption │
│ • Secrets migration (plaintext→encrypted) │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Audit Logging │
│ • All credential access logged │
│ • 90-day retention │
│ • Security event tracking │
└─────────────────────────────────────┘

**Security Rating:** A- (documented audit)
Main Repo (SaaS)
**Security Layers:**
┌─────────────────────────────────────┐
│ Network Security │
│ • TLS 1.3 │
│ • DDoS Protection (Global Edge) │
│ • IP whitelisting (enterprise) │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Authentication │
│ • JWT with tenant context │
│ • OAuth 2.0 for integrations │
│ • API key support (BYOK) │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Tenant Isolation │
│ • Subdomain-based routing │
│ • Row-Level Security (PostgreSQL) │
│ • S3 prefix isolation │
│ • Redis namespace separation │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Agent Governance │
│ • Maturity-based permissions │
│ • Real-time permission validation │
│ • Constitutional guardrails │
│ • Comprehensive audit logging │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ Abuse Protection │
│ • Per-tenant rate limits │
│ • Resource quotas │
│ • Anomaly detection │
│ • Automatic throttling │
└─────────────────────────────────────┘

**Additional SaaS Security:**
- ✅ Tenant enumeration prevention
- ✅ Tenant isolation validation
- ✅ Cross-tenant access prevention
- ✅ Multi-tenant audit trail
- ✅ RBAC with 40+ permissions
- ✅ Budget enforcement (4 modes)
---
6.2 Tenant Enumeration Prevention
atom-upstream
**Status:** ❌ Not applicable (single tenant)
Main Repo (SaaS)
**Implementation:**
```python
# core/tenant_context.py
def validate_tenant_exists(db: Session, tenant_id: str) -> bool:
    """
    Validate that a tenant exists in the database.
    Prevents tenant enumeration attacks by only returning a boolean.
    """
    try:
        from core.models import Tenant

        # Query tenant by ID (no error messages that leak info)
        tenant = db.query(Tenant).filter(
            Tenant.id == tenant_id
        ).first()

        # Check if tenant exists and is active
        return tenant is not None and tenant.is_active
    except Exception as e:
        logger.error(f"Error validating tenant existence: {e}")
        return False  # Always return False on error (no leakage)

def get_safe_tenant_context(request: Request) -> Dict[str, Any]:
    """
    Safely extract tenant context with database validation.
    Prevents tenant enumeration attacks.
    """
    result = {
        "tenant_resolved": False,
        "tenant_id": None
    }

    # Extract tenant_id from request state
    tenant_id = getattr(request.state, "tenant_id", None)
    if not tenant_id:
        logger.debug("No tenant_id in request.state")
        return result

    # Get database session
    db = get_db_from_request(request)
    if not db:
        logger.warning("No database session in request scope")
        return result

    # Validate tenant exists in database
    if validate_tenant_exists(db, tenant_id):
        result["tenant_resolved"] = True
        result["tenant_id"] = tenant_id
        logger.debug(f"Tenant context validated: {tenant_id}")
    else:
        # Log security event for invalid tenant_id
        logger.warning(
            f"Invalid tenant_id in request.state: {tenant_id[:8]}... "
            f"(potential enumeration attempt)"
        )
        # Import and call security event logger
        try:
            from core.security import log_tenant_enumeration_attempt
            log_tenant_enumeration_attempt(request, tenant_id)
        except ImportError:
            logger.warning("Unable to log security event")

    return result
```

**Security Benefits:**
- ✅ No error messages that leak tenant existence
- ✅ Boolean return only (no detailed errors)
- ✅ Security event logging for enumeration attempts
- ✅ Consistent response time (no timing attacks)
---
7. Monitoring & Operations
7.1 Monitoring Stack
atom-upstream
**Personal Edition:**
- Basic logging
- Health checks on all services
- Service status monitoring
**Enterprise Edition:**
- Prometheus metrics collection
- Grafana dashboards
- Alert configuration
- Performance monitoring
**Scripts:**
```bash
# production/monitoring.sh
./scripts/monitor_production.sh
./scripts/backup.sh
./scripts/database_migrations.sh
```

Main Repo (SaaS)
**Current:**
- ATOM Cloud managed monitoring
- Basic health endpoints
- Error logging (Sentry)
**Missing:**
- ❌ Prometheus setup
- ❌ Grafana dashboards
- ❌ Alert configuration
- ❌ Backup automation scripts
- ❌ Performance monitoring tools
---
7.2 Backup & Recovery
atom-upstream
**Backup Script:**
```bash
#!/bin/bash
# production/backup.sh

# Create timestamped SQL dump
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="backups/atom_backup_${TIMESTAMP}.sql"
docker-compose exec postgres pg_dump -U atom atom_db > $BACKUP_FILE

# Compress backup
gzip $BACKUP_FILE

# 7-day retention policy
find backups/ -name "*.sql.gz" -mtime +7 -delete

echo "Backup completed: ${BACKUP_FILE}.gz"
```

**Recovery:**

```bash
# Restore from backup
gunzip backups/atom_backup_20260331_120000.sql.gz
docker-compose exec -T postgres psql -U atom atom_db < backups/atom_backup_20260331_120000.sql
```

Main Repo (SaaS)
**Current:**
- ATOM Cloud automatic backups (managed service)
- Neon PostgreSQL automatic backups (managed)
- Upstash Redis automatic backups (managed)
**Missing:**
- ❌ Manual backup scripts
- ❌ Documented restore procedures
- ❌ Backup retention policies
- ❌ Disaster recovery runbooks
---
8. Testing & Quality
8.1 Test Coverage
atom-upstream
**Test Suites:**
- Unit tests (Python, TypeScript)
- Integration tests
- E2E tests (Playwright)
- Property-based tests
- Cross-platform tests
**Configuration:**
```yaml
# docker-compose-e2e.yml
services:
  playwright:
    image: mcr.microsoft.com/playwright
    volumes:
      - ./tests:/app/tests
    command: pytest tests/e2e/ -v
```

**Coverage Reports:**
- HTML coverage reports
- XML coverage reports
- Final test reports
Main Repo (SaaS)
**Test Suites:**
- Unit tests (partial)
- Integration tests (partial)
- E2E tests (minimal - 212 tests)
**Missing:**
- ❌ Comprehensive E2E suite
- ❌ Property-based tests
- ❌ Cross-platform tests
- ❌ Coverage reports
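To show what closing the property-based-testing gap might look like, here is a rough sketch of the style using only the standard library (a real suite would likely use a dedicated framework such as Hypothesis; `normalize_tenant_id` is an invented example function):

```python
import random

def normalize_tenant_id(raw: str) -> str:
    """Hypothetical function under test: trim and lowercase an identifier."""
    return raw.strip().lower()

def check_idempotence(trials: int = 200) -> bool:
    """Property: normalizing twice equals normalizing once, over random inputs."""
    alphabet = "ABCdef123- _"
    for _ in range(trials):
        raw = "".join(random.choice(alphabet) for _ in range(random.randint(0, 12)))
        once = normalize_tenant_id(raw)
        if normalize_tenant_id(once) != once:
            return False
    return True
```

The property (idempotence) is asserted over many generated inputs rather than a handful of hand-picked cases, which is what distinguishes this style from ordinary unit tests.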
---
9. CLI & Developer Tools
9.1 Command-Line Interface
atom-upstream
**Atom CLI:**
```bash
# Installation
pip install atom-os

# Commands
atom init                 # Initialize installation
atom start                # Start services
atom daemon               # Start background service
atom status               # Check daemon status
atom stop                 # Stop services
atom enable enterprise    # Upgrade to Enterprise edition
```

**Implementation:**
```python
# atom-cli/commands/init.py
import click

@click.command()
def init():
    """Initialize Atom installation"""
    create_directories()
    generate_env_file()
    initialize_database()
    print("✅ Atom initialized successfully!")
```

Main Repo (SaaS)
**Status:** ❌ No CLI
**Manual Setup:**
```bash
# Manual installation
git clone <repo>
cd atom-saas
npm install
cd backend-saas
pip install -r requirements.txt

# Manual environment configuration
# Manual database setup
```

---
10. Summary & Recommendations
10.1 Architecture Comparison Summary
| Aspect | atom-upstream | Main Repo (SaaS) | Gap |
|---|---|---|---|
| **Tenancy Model** | Single-tenant | Multi-tenant + RLS | ✅ Intentional |
| **Database** | SQLite/PostgreSQL | PostgreSQL + pgvector | ✅ Intentional |
| **Integrations** | 46+ providers | ~10 providers | ⚠️ **78% gap** |
| **Deployment Options** | 5 methods | 1 method (Cloud) | ⚠️ **80% gap** |
| **Docker Compose** | Comprehensive (250+ lines) | Basic (50 lines) | ⚠️ **Significant** |
| **Environment Config** | 323 lines, 19 sections | 237 lines, 6 sections | ⚠️ **27% gap** |
| **Deployment Scripts** | 8 automation scripts | 2 basic scripts | ⚠️ **75% gap** |
| **Documentation** | 654 lines (INSTALLATION.md) | ~200 lines | ⚠️ **70% gap** |
| **CLI Tools** | Full CLI (pip installable) | None | ⚠️ **Complete gap** |
| **Monitoring** | Prometheus + Grafana | Cloud managed only | ⚠️ **Significant** |
| **Backup/Recovery** | Automated scripts | Managed service only | ⚠️ **Complete gap** |
| **Local-Only Mode** | ✅ Privacy-focused | ❌ Not available | ⚠️ **Feature gap** |
| **Brain Systems** | 6 systems (full) | 6 systems (tenant-aware) | ✅ Enhanced |
| **Governance** | Basic maturity levels | Enhanced + RBAC + quotas | ✅ Enhanced |
| **Security** | A- rated | A- rated + tenant isolation | ✅ Enhanced |
---
10.2 Strategic Recommendations
Immediate Actions (P0)
- **Port Self-Hosting Documentation**
- Merge upstream INSTALLATION.md sections
- Add Docker Compose deployment guide
- Document native installation
- Create backup/recovery procedures
- **Enhance Docker Compose**
- Add frontend service
- Add browser automation
- Include health checks
- Add volume persistence
- Create development and production variants
- **Restore Integration Coverage**
- Port missing 35+ integrations
- Implement local-only mode
- Update OAuth setup guides
- **Create Deployment Scripts**
- Production setup automation
- Backup automation
- Monitoring setup
- One-command start
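A minimal sketch of what the "one-command start" could wrap, assuming per-environment compose files (`docker-compose.<env>.yml` is an assumption, not the repo's actual layout):

```python
import shutil
import subprocess

def compose_command(env: str = "production") -> list:
    """Build the docker compose invocation for a given environment."""
    compose_file = f"docker-compose.{env}.yml"
    return ["docker", "compose", "-f", compose_file, "up", "-d"]

def start(env: str = "production") -> None:
    """Fail fast if docker is missing, then bring the stack up detached."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker is required for the one-command start")
    subprocess.run(compose_command(env), check=True)
```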
Short-Term (P1)
- **Add Monitoring Stack**
- Prometheus setup
- Grafana dashboards
- Alert configuration
- **Improve Testing**
- E2E test suite
- Property-based tests
- Coverage reports
- **Create CLI Tools**
  - `atom init` command
  - `atom start`/`atom stop` commands
  - `atom daemon` mode
Long-Term (P2)
- **Maintain Parallel Documentation**
- SaaS deployment guide
- Self-hosted deployment guide
- Migration guide (upstream → SaaS)
- **Code Sharing Strategy**
- Extract common brain systems to shared library
- Maintain integration adapters in upstream
- SaaS-specific code in main repo
---
10.3 Architecture Decision Record
**Decision:** Maintain dual architecture (SaaS + Self-Hosted)
**Rationale:**
- Different market segments (SaaS customers vs self-hosters)
- Shared core intelligence (brain systems)
- Different deployment requirements
- Revenue diversification
**Implementation:**
- Keep atom-upstream as single-tenant reference
- Enhance main repo with SaaS features
- Port self-hosting improvements from upstream to main repo
- Maintain integration parity
---
**Document Version:** 1.0
**Last Updated:** March 31, 2026
**Next Review:** April 30, 2026